46 research outputs found

    Real-time event detection in massive streams

    Grant award number EP/J020664/1. New event detection, also known as first story detection (FSD), has become very popular in recent years. The task consists of finding previously unseen events in a stream of documents. Despite its apparent simplicity, FSD is very challenging and has applications wherever timely access to fresh information is crucial: from journalism to stock market trading, homeland security, or emergency response. With the rise of user-generated content and citizen journalism we have entered an era of big and noisy data, yet traditional approaches to FSD are not designed to deal with this new type of data. The amount of information generated today exceeds previously available datasets by many orders of magnitude, making traditional approaches obsolete for modern event detection. In this thesis, we propose a modern approach to event detection that scales to unbounded streams of text without sacrificing accuracy. This is a crucial property that enables us to detect events in large streams like Twitter, which none of the previous approaches were able to do. One of the major problems in detecting new events is vocabulary mismatch, also known as lexical variation. This problem arises when different authors use different words to describe the same event, and it is inherent to human language. We show how to mitigate this problem in FSD by using paraphrases. Our paraphrase-based approach achieves state-of-the-art results on the FSD task while remaining efficient and able to process unbounded streams. Another important property of user-generated content is its high level of noise, and Twitter is no exception. This is another problem that traditional approaches were not designed to deal with, and here we investigate different methods of reducing the amount of noise. We show that by using information from Wikipedia, it is possible to significantly reduce the number of spurious events detected in Twitter while maintaining very low detection latency. A question is often raised as to whether Twitter is at all useful, especially if one has access to a high-quality stream such as the newswire, or whether it should be considered a sort of poor man’s newswire. In our comparison of these two streams we find that Twitter contains events not present in the newswire, and that it also breaks some events sooner, showing that it is useful for event detection even in the presence of newswire.

    Can Twitter replace newswire for breaking news?

    Twitter is often considered to be a useful source of real-time news, potentially replacing newswire for this purpose. But is this true? In this paper, we examine the extent to which news reporting in newswire and Twitter overlaps and whether Twitter often reports news faster than traditional newswire providers. In particular, we analyse 77 days' worth of tweets and newswire articles with respect to both manually identified major news events and larger volumes of automatically identified news events. Our results indicate that Twitter reports the same events as newswire providers, in addition to a long tail of minor events ignored by mainstream media. However, contrary to popular belief, neither stream leads the other when dealing with major news events, indicating that the value that Twitter can bring in a news setting comes predominantly from increased event coverage, not timeliness of reporting.

    Cut Length Distributions of Haylage Particles

    Alfalfa is one of the most important crops for forage production. The traditional method of conserving alfalfa is hay preparation. However, nowadays it is also commonly processed in the form of silage and haylage. The physiological effects of forages included in diets depend on plant species, stage of maturity, method of preservation and diet composition. The physical characteristics of rations for ruminants are primarily influenced by the dietary forage-to-concentrate ratio, the type of forages and concentrates, and the mean particle size of feeds. The length distribution of forage particles is an important parameter in ruminant diet formulation, especially for dairy cattle. During silage production, harvest considerations should focus on obtaining an adequate particle size distribution of the ensiled crop. This paper presents the results of testing three contemporary types of self-propelled silage harvesters applied in alfalfa haylage preparation: Claas Jaguar 950, Krone Big X 700 and Krone Big X 500. All machines were fitted with pick-up headers. The study analyzes the length distributions of chopped alfalfa particles. The resulting frequency distributions of the produced haylage are characterised by a high mass percentage of the fraction comprising the largest particles. It is also evident that the Claas Jaguar 950 harvester achieved the mean chopping length closest to the preset value.

    Combines Work Quality in Maize Silage Production

    The paper presents test results for three silage combines employed in maize silage preparation in the Toplica region. It focuses on determining the technical working parameters of the tested machines. The results verified the superiority of the John Deere 5820 silage combine, which produced chopped mass whose particle lengths deviated least from the preset cutting length. In this case, the average length of the chopped mass was 9.9 mm, with 69% of the mass in the range up to 8 mm. The other two silage combines produced a lower mass percentage of this fraction and larger variations of particle length with respect to the preset length. The minimum mass flow rate was recorded for the silage combine Fortschrit E-286: 7.3 kg s⁻¹ (26.3 t h⁻¹), with a surface productivity of 0.83 ha h⁻¹ at an average speed of 4.0 km h⁻¹. The maximum production rate was achieved with the John Deere 5820 silage combine: 10.9 kg s⁻¹ (39.1 t h⁻¹) at an average working speed of 4.7 km h⁻¹ and a surface productivity of 1.21 ha h⁻¹.

    Streaming first story detection with application to Twitter

    With the recent rise in popularity and size of social media, there is a growing need for systems that can extract useful information from this volume of data. We address the problem of detecting new events from a stream of Twitter posts. To make event detection feasible on web-scale corpora, we present an algorithm based on locality-sensitive hashing which is able to overcome the limitations of traditional approaches while maintaining competitive results. In particular, a comparison with a state-of-the-art system on the first story detection task shows that we achieve over an order of magnitude speedup in processing time while retaining comparable performance. Event detection experiments on a collection of 160 million Twitter posts show that celebrity deaths are the fastest spreading news on Twitter.
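The abstract above does not describe the algorithm in detail, so the following is only a minimal sketch of the general idea of using locality-sensitive hashing for first story detection, not the authors' implementation. It assumes random-hyperplane (cosine) hashing over term-frequency vectors, and the names `LshNoveltyDetector`, `vectorize`, `num_tables` and `num_bits` are hypothetical. Each incoming document is compared only with earlier documents that fall into the same hash bucket, and its novelty score is one minus the highest cosine similarity found.

```python
import math
import random
from collections import Counter, defaultdict


def vectorize(text):
    """Turn a document into a unit-length sparse term-frequency vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {term: c / norm for term, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(term, 0.0) for term, w in u.items())


class LshNoveltyDetector:
    """Hypothetical sketch: random-hyperplane LSH over an open vocabulary."""

    def __init__(self, num_tables=8, num_bits=6, seed=0):
        self.rng = random.Random(seed)
        self.num_tables = num_tables
        self.num_bits = num_bits
        self.weights = {}  # term -> one Gaussian component per hyperplane
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _term_weights(self, term):
        # Hyperplane components are drawn lazily the first time a term is seen.
        if term not in self.weights:
            size = self.num_tables * self.num_bits
            self.weights[term] = [self.rng.gauss(0, 1) for _ in range(size)]
        return self.weights[term]

    def _signature(self, vec, table):
        # One bit per hyperplane: the sign of the dot product with the document.
        bits = 0
        for b in range(self.num_bits):
            idx = table * self.num_bits + b
            dot = sum(w * self._term_weights(t)[idx] for t, w in vec.items())
            bits = (bits << 1) | (dot >= 0)
        return bits

    def novelty(self, text):
        """Score a document against colliding earlier documents, then index it."""
        vec = vectorize(text)
        best = 0.0
        for i, table in enumerate(self.tables):
            bucket = table[self._signature(vec, i)]
            for other in bucket:
                best = max(best, cosine(vec, other))
            bucket.append(vec)
        return 1.0 - best
```

A document whose score exceeds some chosen threshold would be flagged as a candidate first story. This sketch lets buckets grow without bound, so a system meant for unbounded streams would also need some policy for capping or expiring bucket contents.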

    Comparison of Essential Metals in Different Pork Meat Cuts from the Serbian Market

    Pork consumption in Serbia accounts for a large share of total meat consumption. Pork is a valuable source of nutrients. We analyzed the metal content of three different cuts of pork collected from the Serbian market during 2014. Analyses of the following isotopes were performed by ICP-MS: zinc (⁶⁶Zn), copper (⁶³Cu) and iron (⁵⁷Fe). Our data show that Zn, Cu and Fe were present at significantly different levels in hind leg, loin and shoulder, and that shoulder meat was richest in the analyzed metals. The differing mineral status of different pork cuts implies differences in their nutritional benefits for the human diet.